Hedge Detection Using the RelHunter Approach

Authors

  • Eraldo Rezende Fernandes
  • Carlos E. M. Crestana
  • Ruy Luiz Milidiú
Abstract

RelHunter is a machine learning based method for extracting structured information from text. Here, we apply RelHunter to the Hedge Detection task, proposed as the CoNLL-2010 Shared Task (Closed Task 2: detection of hedge cues and their scopes). RelHunter's key design idea is to model the target structures as a relation over entities. The method decomposes the original task into three subtasks: (i) Entity Identification; (ii) Candidate Relation Generation; and (iii) Relation Recognition. In the Hedge Detection task, we define three types of entities: cue chunk, start scope token and end scope token. Hence, the Entity Identification subtask is further decomposed into three token classification subtasks, one for each entity type. In the Candidate Relation Generation subtask, we apply a simple procedure to generate a ternary candidate relation. Each instance in this relation represents a hedge candidate composed of a cue chunk, a start scope token and an end scope token. For the Relation Recognition subtask, we use a binary classifier to discriminate between true and false candidates. The four classifiers are trained with the Entropy Guided Transformation Learning algorithm. Compared with the other hedge detection systems of the CoNLL shared task, our scheme shows competitive performance. The F-score of our system is 54.05 on the evaluation corpus.

∗ This work is partially funded by CNPq and FAPERJ grants 557.128/2009-9 and E-26/170028/2008.
† Holds a CNPq doctoral fellowship and has financial support from IFG, Brazil.
‡ Holds a CAPES doctoral fellowship.
§ Holds a CNPq research fellowship.
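The Candidate Relation Generation and Relation Recognition subtasks described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the "simple procedure", so the pairing rule used here (a candidate scope must enclose its cue chunk) is an assumption, as are the names `CueChunk`, `generate_candidates` and `recognize`. The classifier is a placeholder callable, not the Entropy Guided Transformation Learning model used by the authors.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class CueChunk:
    """Hypothetical cue chunk: inclusive token indices within one sentence."""
    start: int
    end: int


# A hedge candidate is one tuple of the ternary relation:
# (cue chunk, start scope token index, end scope token index).
Candidate = Tuple[CueChunk, int, int]


def generate_candidates(cues: List[CueChunk],
                        start_tokens: List[int],
                        end_tokens: List[int]) -> List[Candidate]:
    """Combine the identified entities of one sentence into candidates.

    Assumption (not stated in the abstract): a candidate is kept only
    when the scope [start, end] encloses the cue chunk.
    """
    candidates = []
    for cue, start, end in product(cues, start_tokens, end_tokens):
        if start <= cue.start and cue.end <= end:
            candidates.append((cue, start, end))
    return candidates


def recognize(candidates: List[Candidate],
              is_true_hedge: Callable[[Candidate], bool]) -> List[Candidate]:
    """Relation Recognition: keep the candidates a binary classifier accepts."""
    return [c for c in candidates if is_true_hedge(c)]


# Toy input: one cue at token 2, two start-token guesses, two end-token guesses.
cues = [CueChunk(2, 2)]
candidates = generate_candidates(cues, start_tokens=[2, 4], end_tokens=[1, 7])
hedges = recognize(candidates, is_true_hedge=lambda c: True)  # placeholder classifier
```

In this toy input only the combination (cue at token 2, start 2, end 7) encloses the cue, so it is the single candidate passed on to the recognizer.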

Similar resources

Study on Gold as a Hedge or Safe Haven for the Stock Market by a Markov Switching Approach

Although gold is no longer a central cornerstone of the international monetary and financial system, it still attracts considerable attention from researchers and investors. Nowadays, many investors manage their risk with valuable assets such as gold. This paper examines the dynamic relationships between gold and stock markets in the Tehran Stock Exchange. We have applied the Markov switching m...

Predicting speculation: a simple disambiguation approach to hedge detection in biomedical literature

BACKGROUND This paper presents a novel approach to the problem of hedge detection, which involves identifying so-called hedge cues for labeling sentences as certain or uncertain. This is the classification problem for Task 1 of the CoNLL-2010 Shared Task, which focuses on hedging in the biomedical domain. We here propose to view hedge detection as a simple disambiguation problem, restricted to ...

Hedge Detection and Scope Finding by Sequence Labeling with Normalized Feature Selection

This paper presents a system which adopts a standard sequence labeling technique for hedge detection and scope finding. For hedge detection, we formulate it as a hedge labeling problem, while for hedge scope finding, we use a two-step labeling strategy, one for hedge labeling and the other for scope finding. In particular, various kinds of syntactic dependencies are systemically exploited and e...

Detecting uncertainty in biomedical literature: a simple disambiguation approach using sparse random indexing

This paper presents a novel approach to the problem of hedge detection, which involves the identification of so-called hedge cues for labeling sentences as certain or uncertain. This is the classification problem for Task 1 of the CoNLL-2010 Shared Task, which focuses on hedging in biomedical literature. We here propose to view hedge detection as a simple disambiguation problem, restricted to w...

Hedge Scope Detection in Biomedical Texts: An Effective Dependency-Based Method

Hedge detection is used to distinguish uncertain information from facts, which is of essential importance in biomedical information extraction. The task of hedge detection is often divided into two subtasks: detecting uncertain cues and their linguistic scope. Hedge scope is a sequence of tokens including the hedge cue in a sentence. Previous hedge scope detection methods usually take all token...

Publication date: 2010